Written
Corpus,
Language Type:
Multilingual
Languages:
Czech English French German Hungarian Polish Spanish Swedish
Availability:
Freely Available
License:
CreativeCommons
Size:
2 MByte
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: Document Translation vs. Query Translation for Cross-Lingual Information Retrieval in the Medical Domain
- Paper track: Long/Information Retrieval and Text Mining
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Shadi Saleh | Khresmoi Summary Translation Test Data 2.0 | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
English French German
Availability:
Freely Available
License:
Size:
None sentences
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture
- Paper track: Short/Machine Translation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Christopher Brix | WMT14 | /N |
Documentation:
None
Written
Ontology,
Language Type:
Monolingual
Languages:
Chinese English French Japanese
Availability:
Freely Available
License:
MIT
Size:
345.4 MByte
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: Neighborhood Matching Network for Entity Alignment
- Paper track: Long/Information Extraction
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yuting Wu | DBP15K | /N |
Documentation:
There is a publicly available English documentation.
Written
Lexicon,
Language Type:
Multilingual
Languages:
English French German Italian Spanish
Availability:
Freely Available
License:
Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Size:
None
Production Status:
Newly created-in progress
Use:
Word Sense Disambiguation
- Paper title: CluBERT: A Cluster-Based Approach for Learning Sense Distributions in Multiple Languages
- Paper track: Long/Semantics: Lexical
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Bianca Scarlini | CluBERT Distributions | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Bulgarian Croatian Czech Danish Dutch English Estonian Finnish French German Greek Hungarian Icelandic Irish Italian Latvian Lithuanian Maltese Polish Portuguese Romanian Slovak Slovenian Spanish Swedish
Availability:
Freely Available
License:
CC-0
Size:
341856530 sentences
Production Status:
Newly created-in progress
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: ParaCrawl: Web-Scale Acquisition of Parallel Corpora
- Paper track: Long/Resources and Evaluation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Philipp Koehn | ParaCrawl | /N |
Documentation:
None
Written
Treebank,
Language Type:
Multilingual
Languages:
Chinese English French German Italian Japanese Russian Spanish
Availability:
Freely Available
License:
CreativeCommons
Size:
None
Production Status:
Existing-used
Use:
Parsing and Tagging
- Paper title: Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries
- Paper track: Short/Machine Learning for NLP
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mozhi Zhang | Universal Dependencies | /N |
Documentation:
None
Written
Evaluation Data,
Language Type:
Multilingual
Languages:
Chinese English French German Italian Japanese Russian Spanish
Availability:
From NIST
License:
Size:
None
Production Status:
Existing-used
Use:
Document Classification, Text categorisation
- Paper title: Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries
- Paper track: Short/Machine Learning for NLP
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mozhi Zhang | Reuters RCV1/RCV2 Multilingual Corpus | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
English French Italian Spanish
Availability:
Freely Available
License:
CreativeCommons BY NC ND 4.0 International
Size:
3370 <audio-transcript-translation> triplets Other
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus
- Paper track: Long/Machine Translation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Marco Turchi | MuST-SHE | /N |
Documentation:
None
Written
Language Modeling Tool,
Language Type:
Monolingual
Languages:
French
Availability:
Freely Available
License:
MIT
Size:
1 GByte
Production Status:
Newly created-finished
Use:
Language Modelling
- Paper title: CamemBERT: a Tasty French Language Model
- Paper track: Long/Resources and Evaluation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Louis Martin | CamemBERT | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
French
Availability:
It's going to be uploaded on GitHub
License:
Size:
11834 tweets Other
Production Status:
Newly created-in progress
Use:
Document Classification, Text categorisation
- Paper title: He said "who's gonna take care of your children when you are at ACL?": Reported Sexist Acts are Not Sexist
- Paper track: Long/Sentiment Analysis, Stylistic Analysis, and Argum
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Patricia Chiril | French Corpus for Sexism Detection | /N |
Documentation:
The current submission




